20 research outputs found

    An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor

    Modern OpenMP threading techniques are used to convert the MPI-only Hartree-Fock code in the GAMESS program to a hybrid MPI/OpenMP algorithm. Two separate implementations are considered, which differ in whether key data structures, the density and Fock matrices, are shared or replicated among threads. All implementations are benchmarked on a supercomputer of 3,000 Intel Xeon Phi processors. With 64 cores per processor, scaling numbers are reported on up to 192,000 cores. The hybrid MPI/OpenMP implementation reduces the memory footprint by approximately 200 times compared to the legacy code. The MPI/OpenMP code was shown to run up to six times faster than the original for a range of molecular system sizes. Comment: SC17 conference paper, 12 pages, 7 figures

    Developing a Computational Chemistry Framework for the Exascale Era

    Within computational chemistry, the NWChem package has arguably been the de facto standard for running high-accuracy numerical simulations on the most powerful supercomputers. In order to better address the challenges presented by emerging exascale architectures, the decision has been made to rewrite NWChem. Design of the resulting package, NWChemEx, has been driven by exascale computing; however, significant additional design considerations have arisen from the team's involvement with the Molecular Sciences Software Institute (MolSSI). MolSSI is a National Science Foundation initiative focused on establishing coding and data standards for the computational chemistry community. As a result, NWChemEx is built upon a general computational chemistry framework called the simulation development environment (SDE) that is designed with a focus on extensibility and interoperability. The present manuscript describes the modular approach of the SDE and how it has been used to implement the self-consistent field algorithm within NWChemEx.

    Knowledge is power: Quantum chemistry on novel computer architectures

    In the first chapter of this thesis, a background of fundamental quantum chemistry concepts is provided. Chapter two contains an analysis of the performance and energy efficiency of various modern computer processor architectures while performing computational chemistry calculations. In chapter three, the processor architectural study is expanded to include parallel computational chemistry algorithms executed across multiple-node computer clusters. Chapter four describes a novel computational implementation of the fundamental Hartree-Fock method which significantly reduces computer memory requirements. In chapter five, a case study of quantum chemistry two-electron integral code interoperability is described. The final chapters of this work discuss applications of quantum chemistry. In chapter six, an investigation of the esterification of acetic acid on acid-functionalized silica is presented. In chapter seven, the application of ab initio molecular dynamics to study the photoisomerization and photocyclization of stilbene is discussed. Final concluding remarks are noted in chapter eight.

    Analyzing the Performance and Accuracy of Lossy Checkpointing on Sub-Iteration of NWChem

    Future exascale systems are expected to be characterized by more frequent failures than current petascale systems. This places increased importance on the application to minimize the amount of time wasted due to recomputation when recovering from a checkpoint. Typically, HPC applications checkpoint at iteration boundaries. However, for applications that have a high per-iteration cost, checkpointing inside the iteration limits the amount of recomputation. This paper analyzes the performance and accuracy of using lossy compressed checkpointing in the computational chemistry application NWChem. Our results indicate that lossy compression is an effective tool for reducing the sub-iteration checkpoint size. Moreover, compression error tolerances that yield acceptable deviation in accuracy and iteration count are quantified.

    Dynamics Simulations with Spin-Flip Time-Dependent Density Functional Theory: Photoisomerization and Photocyclization Mechanisms of cis-Stilbene in ππ* States

    On-the-fly dynamics simulations were carried out using spin-flip time-dependent density functional theory (SF-TDDFT) to examine the photoisomerization and photocyclization mechanisms of cis-stilbene following excitation to the ππ* state. A state tracking method was devised to follow the target state among nearly degenerate electronic states during the dynamics simulations. The steepest descent path from the Franck–Condon structure of cis-stilbene in the ππ* state is shown to reach the S1-minimum of 4,4-dihydrophenanthrene (DHP) via a cis-stilbene-like structure (referred to as (S1)cis-min) on a very flat region of the S1-potential energy surface. From the dynamics simulations, the branching ratio of the photoisomerization is calculated as trans:DHP = 35:13, in very good agreement with the experimental data, trans:DHP = 35:10. The discrepancy between the steepest descent pathway and the significant trans-stilbene presence in the branching ratio observed experimentally and herein computationally is clarified from an analysis of geometrical features along the reaction pathway, as well as the low barrier of 0.1 eV for the pathway from (S1)cis-min to the twisted pyramidal structure on the S1-potential energy surface. It is concluded that ππ*-excited cis-stilbene propagates primarily toward the twisted structural region due to dynamic effects, with partial branching to the DHP structural region via the flat-surface region around (S1)cis-min. Reprinted (adapted) with permission from Journal of Physical Chemistry A 118 (2014): 11987, doi:10.1021/jp5072428. Copyright 2014 American Chemical Society.

    Energy-Efficient Computational Chemistry: Comparison of x86 and ARM Systems

    The computational efficiency and energy-to-solution of several applications using the GAMESS quantum chemistry suite of codes are evaluated for 32-bit and 64-bit ARM-based computers and compared to an x86 machine. The x86 system completes all benchmark computations more quickly than either ARM system and is the best choice to minimize time to solution. The ARM64 and ARM32 computational performances are similar to each other for Hartree–Fock and density functional theory energy calculations. However, for memory-intensive second-order perturbation theory energy and gradient computations, the lower ARM32 read/write memory bandwidth results in computation times as much as 86% longer than on the ARM64 system. The ARM32 system is more energy efficient than the x86 and ARM64 CPUs for all benchmarked methods, while the ARM64 CPU is more energy efficient than the x86 CPU for some core counts and molecular sizes. Reprinted (adapted) with permission from Journal of Chemical Theory and Computation 11 (2015): 5055, doi:10.1021/acs.jctc.5b00713. Copyright 2015 American Chemical Society.